Adaptation of Relation Extraction Rules to New Domains

Authors

  • Feiyu Xu
  • Hans Uszkoreit
  • Hong Li
  • Niko Felger
Abstract

This paper presents several strategies for improving the extraction of less prominent relations with the help of rules learned for similar relations for which large volumes of data with suitable properties are available. The rules are learned by DARE, a minimally supervised machine learning system for relation extraction. Starting from semantic seeds, DARE extracts linguistic grammar rules associated with semantic roles from parsed news texts. A performance analysis across different experiment domains shows that the properties of the data play an important role for DARE. In particular, the redundancy of the data and the connectivity between instances and pattern rules have a strong influence on recall. However, most real-world data sets do not possess the desirable small-world property. We therefore propose three strategies for overcoming the data property problem of some domains by exploiting a similar domain with better data properties. The first two strategies stay with the same corpus but try to extract new, similar relations with the learned rules. The third strategy adapts the learned rules to a new corpus. All three strategies show that frequently mentioned relations can help in the detection of less frequent relations.
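
The seed-driven loop described in the abstract can be illustrated with a toy sketch. It is not DARE itself: DARE learns rules from parsed text with semantic role labels, whereas this example uses plain surface strings as a stand-in, and every name in it (`learn_patterns`, `apply_patterns`, `bootstrap`, the `<A>`/`<B>` slots, the four-sentence corpus) is a hypothetical choice made for illustration. It only shows how instances and patterns alternate, and why an instance that shares no pattern with the seed is never reached.

```python
import re

SLOT_A, SLOT_B = "<A>", "<B>"  # hypothetical slot markers for the two arguments


def learn_patterns(sentences, instances):
    """Turn every sentence that mentions a known instance into a slotted pattern."""
    patterns = set()
    for sentence in sentences:
        for winner, prize in instances:
            if winner in sentence and prize in sentence:
                pattern = sentence.replace(winner, SLOT_A, 1).replace(prize, SLOT_B, 1)
                patterns.add(pattern)
    return patterns


def to_regex(pattern):
    """Compile a slotted pattern; literal text is escaped, slots become groups."""
    pieces = []
    for part in re.split(r"(<A>|<B>)", pattern):
        if part == SLOT_A:
            pieces.append(r"(?P<a>.+?)")
        elif part == SLOT_B:
            pieces.append(r"(?P<b>.+?)")
        else:
            pieces.append(re.escape(part))
    return re.compile("".join(pieces))


def apply_patterns(sentences, patterns):
    """Match every pattern against every sentence and collect the filled slots."""
    found = set()
    for pattern in patterns:
        regex = to_regex(pattern)
        for sentence in sentences:
            match = regex.fullmatch(sentence)
            if match:
                found.add((match.group("a"), match.group("b")))
    return found


def bootstrap(sentences, seeds, max_iterations=10):
    """Alternate pattern learning and instance extraction until nothing new appears."""
    instances, patterns = set(seeds), set()
    for _ in range(max_iterations):
        new_patterns = learn_patterns(sentences, instances) - patterns
        patterns |= new_patterns
        new_instances = apply_patterns(sentences, patterns) - instances
        if not new_patterns and not new_instances:
            break
        instances |= new_instances
    return instances, patterns


# Toy corpus (an assumption for this sketch, not data from the paper).
SENTENCES = [
    "Marie Curie won the Nobel Prize.",
    "Toni Morrison won the Pulitzer Prize.",
    "The Pulitzer Prize was awarded to Toni Morrison.",
    "The Booker Prize was awarded to Salman Rushdie.",
    "Kazuo Ishiguro received the Booker Prize.",
]
SEEDS = {("Marie Curie", "Nobel Prize")}

instances, patterns = bootstrap(SENTENCES, SEEDS)
for winner, prize in sorted(instances):
    print(winner, "->", prize)
```

In this toy corpus the seed leads to a second instance through a shared pattern, and that instance, being mentioned twice, yields a second pattern that reaches a third instance. The last sentence is never reached because no known instance appears in a sentence of that shape, which mirrors the abstract's point that redundancy and connectivity limit recall.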


Similar resources

Embedding Semantic Similarity in Tree Kernels for Domain Adaptation of Relation Extraction

Relation Extraction (RE) is the task of extracting semantic relationships between entities in text. Recent studies on relation extraction are mostly supervised. The clear drawback of supervised methods is the need for training data: labeled data is expensive to obtain, and there is often a mismatch between the training data and the data the system will be applied to. This is the problem of domai...


Robust Domain Adaptation for Relation Extraction via Clustering Consistency

We propose a two-phase framework to adapt existing relation extraction classifiers to extract relations for new target domains. We address two challenges: negative transfer when knowledge in source domains is used without considering the differences in relation distributions; and lack of adequate labeled samples for rarer relations in the new domain, due to a small labeled data set and imbalanc...


Bootstrapping relation extraction from semantic seeds

Information Extraction (IE) is a technology for localizing and classifying pieces of relevant information in unstructured natural language texts and detecting relevant relations among them. This thesis deals with one of the central tasks of IE, i.e., relation extraction. The goal is to provide a general framework that automatically learns mappings between linguistic analyses and target semantic...


Domain Adaptation for Relation Extraction with Domain Adversarial Neural Network

Relations are expressed in many domains such as newswire, weblogs and phone conversations. Trained on a source domain, a relation extractor’s performance degrades when applied to target domains other than the source. A common yet labor-intensive method for domain adaptation is to construct a target-domain-specific labeled dataset for adapting the extractor. In response, we present an unsupervise...


Domain-Neutral Relation Characterisation: Evaluation on Disease-Treatment Data

Adapting conventional supervised relation extraction (RE) systems to new domains requires significant effort from annotators and developers. Thus, we propose models for relation characterisation – the subtask of RE that assigns types to extracted relations – that have domain adaptation costs of zero. Development experiments on newswire text compare dimensionality reduction techniques and show t...


Publication date: 2008